How to Perform Residual Normality Analysis in Linear Regression Using R Studio and Interpret the Results

By Kanda Data / Date: Nov 11, 2024 / Category: Data Analysis in R

Residual normality testing is a key assumption check in linear regression analysis using the Ordinary Least Squares (OLS) method. The hypothesis tests reported with a linear regression assume that the residuals follow a normal distribution, so this assumption should be verified before the results are interpreted. In this article, Kanda Data shares a tutorial on how to perform residual normality analysis in linear regression using R Studio, along with steps to interpret the results.

Let’s use the following case study as an example to practice residual normality testing in R Studio. The study aims to examine the effect of advertising costs and the number of marketing staff on product sales volume. Based on this objective, the equation specification can be written as follows:

$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \epsilon$

Where:

$Y$ is product sales (in thousand units) as the dependent variable,

$X_1$ is advertising cost (in hundreds of US dollars) as the first independent variable,

$X_2$ is the number of marketing staff (in number of employees) as the second independent variable,

$\beta_0$ is the intercept, and $\beta_1$ and $\beta_2$ are regression coefficients,

$\epsilon$ represents the error or residual.

For this exercise, data from 15 observations were collected for the variables Sales, Advertising Cost, and Marketing Staff. Details of the data are presented in the table below:

Steps to Perform Residual Normality Testing in Linear Regression Analysis

First, download and install both R and R Studio on your computer. Once they are set up, you'll need to run a multiple linear regression analysis before proceeding with residual normality testing, because the test is applied to the residuals of the fitted model.

After opening R Studio, import the data for analysis. There are two methods to do this: (a) Directly import data from Excel; and (b) Manually input the data using commands in R Studio.

For detailed instructions on importing data from Excel or manual input methods, refer to our previous articles.
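As a rough illustration of option (b), a data frame with the three columns used later in this tutorial can be built directly in R. The values below are simulated placeholders, not the 15 observations from the case study; only the column names (Sales, Advertising_Cost, Marketing_Staff) and the object name `data` are taken from the regression command used in this article, and the coefficients used to simulate Sales are arbitrary.

```r
# Placeholder data: simulated values, NOT the article's actual observations.
set.seed(123)  # make the simulated numbers reproducible

n <- 15  # the case study uses 15 observations

data <- data.frame(
  Advertising_Cost = round(runif(n, min = 10, max = 50), 1),  # hundreds of USD
  Marketing_Staff  = sample(5:20, n, replace = TRUE)          # number of employees
)

# Simulated sales (thousand units): arbitrary coefficients plus normal noise
data$Sales <- 5 + 0.8 * data$Advertising_Cost + 1.2 * data$Marketing_Staff +
  rnorm(n, mean = 0, sd = 3)

str(data)   # check column names and types
head(data)  # preview the first rows
```

With real data, the simulated columns would simply be replaced by the observed values typed in with `c(...)`.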

Run the following command to perform the regression analysis:

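# Fit the multiple linear regression of Sales on the two predictors
model <- lm(Sales ~ Advertising_Cost + Marketing_Staff, data = data)

# Print the coefficients, standard errors, R-squared, and F-statistic
summary(model)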

Press Enter or click Run to generate the regression analysis output, which should look similar to the following:

As mentioned earlier, residual normality testing ensures that the residuals from the regression equation in this case study follow a normal distribution.
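Before testing, it may help to see what the residuals actually are. The sketch below assumes simulated placeholder data (not the case study's observations) and shows that each residual is the observed Sales value minus the value fitted by the model. The names `res` and `manual_res` are illustrative only.

```r
# Simulated placeholder data (not the article's observations)
set.seed(123)
n <- 15
data <- data.frame(
  Advertising_Cost = runif(n, min = 10, max = 50),
  Marketing_Staff  = sample(5:20, n, replace = TRUE)
)
data$Sales <- 5 + 0.8 * data$Advertising_Cost + 1.2 * data$Marketing_Staff +
  rnorm(n, mean = 0, sd = 3)

model <- lm(Sales ~ Advertising_Cost + Marketing_Staff, data = data)

# Residual = observed value minus fitted value
res <- residuals(model)
manual_res <- data$Sales - fitted(model)

all.equal(unname(res), unname(manual_res))  # the two definitions agree
length(res)                                  # one residual per observation
sum(res)                                     # numerically zero when the model has an intercept
```

These are the values that the normality test examines, not the raw Sales data.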

You can use the Shapiro-Wilk test or a QQ plot for this purpose. In this tutorial, we’ll use the Shapiro-Wilk test. Run the following command in R Studio:

shapiro.test(residuals(model))

The output of the Shapiro-Wilk test will look like this:

Shapiro-Wilk normality test

data:  residuals(model)

W = 0.94428, p-value = 0.4393

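The analysis shows a W value of 0.94428 and a p-value of 0.4393. The null hypothesis of the Shapiro-Wilk test is that the residuals are normally distributed. Since the p-value is greater than 0.05, we fail to reject the null hypothesis: there is no evidence that the residuals depart from a normal distribution, so the normality assumption is considered satisfied.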
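The Q-Q plot mentioned above can serve as a visual complement to the test. The following is a minimal sketch, again assuming simulated placeholder data rather than the case study's observations. It draws the plot with base R's `qqnorm()` and `qqline()`, then spells out the Shapiro-Wilk decision rule at a 0.05 significance level. The names `res`, `sw`, and `alpha` are illustrative.

```r
# Simulated placeholder data (not the article's observations)
set.seed(123)
n <- 15
data <- data.frame(
  Advertising_Cost = runif(n, min = 10, max = 50),
  Marketing_Staff  = sample(5:20, n, replace = TRUE)
)
data$Sales <- 5 + 0.8 * data$Advertising_Cost + 1.2 * data$Marketing_Staff +
  rnorm(n, mean = 0, sd = 3)

model <- lm(Sales ~ Advertising_Cost + Marketing_Staff, data = data)
res <- residuals(model)

# Visual check: points lying close to the reference line suggest approximate normality
qqnorm(res, main = "Normal Q-Q Plot of Residuals")
qqline(res, col = "red")

# Formal check: Shapiro-Wilk test with the decision rule made explicit
sw <- shapiro.test(res)
alpha <- 0.05  # significance level used in this tutorial

if (sw$p.value > alpha) {
  message("Fail to reject H0: no evidence against normality (p = ",
          round(sw$p.value, 4), ")")
} else {
  message("Reject H0: residuals deviate from normality (p = ",
          round(sw$p.value, 4), ")")
}
```

With only 15 observations the test has limited power to detect departures from normality, so looking at the plot alongside the p-value is a reasonable habit.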

This concise tutorial is brought to you by Kanda Data. We hope this guide proves valuable to our readers. Stay tuned for more updates and articles from us!
